Related resources
Arabic Dialect Identification
The written form of the Arabic language, Modern Standard Arabic (MSA), differs in a nontrivial manner from the various spoken regional dialects of Arabic – the true “native languages” of Arabic speakers. Those dialects, in turn, differ quite a bit from each other. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. In this article, we des...
Verifiably Effective Arabic Dialect Identification
Several recent papers on Arabic dialect identification have hinted that using a word unigram model is sufficient and effective for the task. However, most previous work was done on a standard fairly homogeneous dataset of dialectal user comments. In this paper, we show that training on the standard dataset does not generalize, because a unigram model may be tuned to topics in the comments and d...
Arabic Dialect Identification in Speech Transcripts
In this paper we describe a system developed to identify a set of four regional Arabic dialects (Egyptian, Gulf, Levantine, North African) and Modern Standard Arabic (MSA) in a transcribed speech corpus. We competed under the team name MAZA in the Arabic Dialect Identification sub-task of the 2016 Discriminating between Similar Languages (DSL) shared task. Our system achieved an F1-score of 0.5...
Sentence Level Dialect Identification in Arabic
This paper introduces a supervised approach for performing sentence level dialect identification between Modern Standard Arabic and Egyptian Dialectal Arabic. We use token level labels to derive sentence-level features. These features are then used with other core and meta features to train a generative classifier that predicts the correct label for each sentence in the given input text. The sy...
Spoken Arabic Dialect Identification Using Phonotactic Modeling
The Arabic language is a collection of multiple variants, among which Modern Standard Arabic (MSA) has a special status as the formal written standard language of the media, culture and education across the Arab world. The other variants are informal spoken dialects that are the media of communication for daily life. Arabic dialects differ substantially from MSA and each other in terms of phono...
Journal
Journal title: Computational Linguistics
Year: 2014
ISSN: 0891-2017,1530-9312
DOI: 10.1162/coli_a_00169